Learning Näıve Bayes Tree for Conditional Probability Estimation

نویسندگان

  • Han Liang
  • Yuhong Yan
چکیده

Näıve Bayes Tree uses decision tree as the general structure and deploys näıve Bayesian classifiers at leaves. The intuition is that näıve Bayesian classifiers work better than decision trees when the sample data set is small. Therefore, after several attribute splits when constructing a decision tree, it is better to use näıve Bayesian classifiers at the leaves than to continue splitting the attributes. In this paper, we propose a learning algorithm to improve the conditional probability estimation in the diagram of Näıve Bayes Tree. The motivation for this work is that, for cost-sensitive learning where costs are associated with conditional probabilities, the score function is optimized when the estimates of conditional probabilities are accurate. The additional benefit is that both the classification accuracy and Area Under the Curve (AUC) could be improved. On a large suite of benchmark sample sets, our experiments show that the CLL tree outperforms the state-of-art learning algorithms, such as Näıve Bayes Tree and näıve Bayes significantly in yielding accurate conditional probability estimation and improving classification accuracy and AUC.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Comparative Analysis of Methods for Probability Estimation Tree

In this paper, we address the problem of probability estimation of decision trees. This problem has received considerable attention in the areas of machine learning and data mining, and techniques to use tree models as probability estimators have been suggested. We make a comparative study of six well-known class probability estimation methods, measured by classification accuracy, AUC and Condi...

متن کامل

Conditional Independence Trees

It has been observed that traditional decision trees produce poor probability estimates. In many applications, however, a probability estimation tree (PET) with accurate probability estimates is desirable. Some researchers ascribe the poor probability estimates of decision trees to the decision tree learning algorithms. To our observation, however, the representation also plays an important rol...

متن کامل

Bayes Networks and Fault Tree Analysis Application in Reliability Estimation (Case Study: Automatic Water Sprinkler System)

In this study, the application of Bayes networks and fault tree analysis in reliability estimation have been investigated. Fault tree analysis is one of the most widely used methods for estimating reliability. In recent years, a method called "Bayes Network" has been used, which is a dynamic method, and information about the probable failure of the system components will be updated according to...

متن کامل

Bayesian network classifiers which perform well with continuous attributes: Flexible classifiers

When modelling a probability distribution with a Bayesian network, we are faced with the problem of how to handle continuous variables. Most previous works have solved the problem by discretizing them with the consequent loss of information. Another common alternative assumes that the data are generated by a Gaussian distribution (parametric approach), such as conditional Gaussian networks, wit...

متن کامل

An Empirical Study on Class Probability Estimates in Decision Tree Learning

Decision tree is one of the most effective and widely used models for classification and ranking and has received a great deal of attention from researchers in the domain of data mining and machine learning. A critical problem in decision tree learning is how to estimate the classmembership probabilities from decision trees. In this paper, we firstly survey all kinds of class probability estima...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2005